Papers related to Markov decision processes
Least-Squares Temporal Difference Learning with Eligibility Traces based on Regularized Extreme Lear
The task of learning the value function under a fixed policy in continuous Markov decision processes (MDPs) is considered......
This talk discusses mean and variance problems in the context of finite-horizon continuous-time Markov decision processes......